Midwestern University. We of course recognize that there is no world in which these big clinical decisions are made solely based on age; however, including age in the guidelines would be useful for making an informed decision, allowing one to pursue other, more viable avenues or to implement maneuvers that have a higher success rate or could buy time to make it to the OR, where the success rate is higher.
Additionally, it would preserve resources and blood products and keep providers out of the way of unnecessary harm.
ICD-9 34.02 (Exploratory Thoracotomy); within 15 minutes of arrival to the ED; penetrating or blunt; 2008-2012 data from the NTDB were used. N = 2,585; reported mortality figures of 92.8%, 92.3%, 94.9%, 99.4% (N = 140; there was a single individual >60 y/o who survived), and 80% (N = 20).
A recent study investigating penetrating cardiac injuries in the NTDB developed a predictive model for outcomes with a predictive power of 93% that was more robust than its previous counterparts. Advances in machine learning, predictive analytics, and increased standardization of large databases such as the NTDB will allow us to better utilize evidence-based medicine for critical decision-making and improved patient outcomes.
Data for 2007 - 2022 were requested and received from the NTDB. The .csv format was used for this project, processed with Python 3 and various packages, namely pandas and numpy. Each patient is assigned an INC_KEY (Incident Key). Integer-based labels used by certain categorical data fields were converted back to text. The NTDB also provides a data dictionary (.pdf) that describes what a lot of the data fields are, which file they can be found in, what years they were actively used, and whether they were renamed from previous years, so you can map this data back to those.
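As a minimal sketch of the code-to-text decode step described above: the column name, code values, and mapping here are made up for illustration (the real mappings come from the NTDB data dictionary), and the inline frame stands in for what `pd.read_csv` would return.

```python
import pandas as pd

# Hypothetical integer-coded categorical field and its text mapping;
# in the real project these mappings come from the NTDB data dictionary.
MOI_LABELS = {1: "Penetrating", 2: "Blunt"}

def decode_categorical(df, column, mapping):
    """Return a copy of df with integer codes in `column` replaced by text labels."""
    out = df.copy()
    out[column] = out[column].map(mapping)
    return out

# Stand-in for pd.read_csv("<some NTDB file>.csv"); keys and values are made up.
records = pd.DataFrame({"INC_KEY": [101, 102, 103], "MOI": [1, 2, 1]})
decoded = decode_categorical(records, "MOI", MOI_LABELS)
```

Keeping the mapping as a plain dictionary also gives you the reverse lookup for free when you need to translate labels back to codes.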
Major changes occurred in 2016 - 2017 along with the transition from ICD-9 to ICD-10, in which the word Thoracotomy does not exist (there is an entire paper in which some very upset individuals complained about the implications of this).
The variability in data fields largely occurs prior to 2016, with the exception of the massive changes that happened during 2017.
I have spent no less than 60 hours this summer on this task alone, tinkering with the data files in the 2007 - 2016 range to get them to play nicely with anything from 2017 - 2022. I've tried many methods to automate (and borderline manually perform) combining this data with the newer data; however, for at least 10 large technical reasons and to little avail, I have not been entirely successful in this endeavor.
For a subset of data fields that exist across all years (2007 - 2022), I can seemingly get the older data to mesh reasonably well with the newer data; HOWEVER, when I plot the mortality rates for the 2007 - 2016 range, they are excessively low (~30-50%).
Clearly something is wrong with my filtering of the data: either it is not capturing all of the deceased cases, or it is including an excessive number of survivor cases that realistically should not be included in this calculation.
Going forward, I will continue to tinker with and try to resolve this problem, as I think it would approximately double the sample size ultimately used (at this time) for this project (~7-8k).
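The sanity check described above can be sketched as a per-year mortality rate; the years, counts, and outcome column name here are toy values, not the real extract, but the pattern (an implausible dip in the older years) is what flags the filtering bug.

```python
import pandas as pd

# Toy stand-in for the merged 2007-2022 frame: one row per RT patient,
# with admit year and outcome (1 = deceased, 0 = survived).
df = pd.DataFrame({
    "YEAR":     [2010, 2010, 2010, 2019, 2019, 2019, 2019],
    "DECEASED": [1, 0, 0, 1, 1, 1, 0],
})

# Mortality rate per year; a sudden drop in the older years is a red flag
# that the filter is missing deceased cases or admitting extra survivors.
mortality_by_year = df.groupby("YEAR")["DECEASED"].mean()
```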
So how do you find a Thoracotomy in ICD-10 codes anymore? Using an ICD-10 code browser (https://www.icd10data.com/) and helpful input on ideas to look for from Dr. Schlanser and Julian Henderson, we tracked down and isolated a list of ICD-10 procedures that, in the context of the ED and being performed within the first 15-20 minutes of arrival, would presumably only be done during an RT (all sternotomy procedures were excluded). With these we were able to isolate many patients who likely underwent RT.
In 2013, a new subset of fields was quietly introduced that includes Hemorrhage Control Surgery along with the times following arrival at which these surgeries were performed.
Thankfully, these do include the procedure Thoracotomy, as well as Sternotomy (excluded), amongst other damage control surgeries. The listed procedure is supposed to be the initial procedure performed, so if Thoracotomy is listed, this does not rule out that a Sternotomy was also done, but it implies that the Sternotomy was not the first or primary procedure.
Aside from the initial damage control procedure listed, there are no additional fields detailing whether and which subsequent damage control procedures were performed (i.e., if another damage control procedure was done first and a thoracotomy was done afterward, we would not know that, and this patient would be excluded).
Ensuring not to double count individuals who had both the Hemorrhage Control Surgery labels and relevant ICD-10 codes, and counting individuals who had multiple ICD-10 codes only once, I obtained a unique list of INC_KEYs and grabbed all of the patient data for them from 2017 - 2022.
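The deduplication step above can be sketched as a union of the two identification routes, with every patient counted exactly once (the keys below are made up):

```python
import pandas as pd

# Patients flagged by the Hemorrhage Control Surgery label and by ICD-10
# codes; a patient can appear in both routes, or several times in the
# ICD-10 route if they had multiple RT-related procedures.
hcs_keys = pd.Series([101, 102, 103], name="INC_KEY")
icd_keys = pd.Series([102, 103, 103, 104], name="INC_KEY")

# Union of both routes, each INC_KEY kept exactly once.
unique_keys = pd.concat([hcs_keys, icd_keys]).drop_duplicates().sort_values()
```

The resulting key list is what gets joined back against the full 2017 - 2022 patient files.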
For patients who did not have Thoracotomy as their damage control procedure but did have ICD-10 procedures suggestive of an RT, I selected these patients (many of whom had multiple of these procedures, which is not surprising; getting intracardiac epi along with a left visual inspection of the heart, open approach, makes sense) and then took the minimum time across their RT-related procedures as the time to RT.
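The minimum-time selection is a per-patient groupby; the keys, column name, and minute values here are made up for illustration:

```python
import pandas as pd

# One row per RT-suggestive ICD-10 procedure; a patient with several such
# procedures contributes several rows.
procs = pd.DataFrame({
    "INC_KEY":      [201, 201, 202],
    "PROC_MINUTES": [12, 7, 4],
})

# Time to RT = earliest of the patient's RT-related procedure times.
time_to_rt = procs.groupby("INC_KEY")["PROC_MINUTES"].min()
```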
There are instances in which these cases had another damage control procedure done (i.e., sternotomy, laparotomy, etc.) with a recorded start time, and sometimes this damage control surgery took place prior to the procedures we are assuming are the result of a thoracotomy.
The caveat is that a damage control procedure other than RT may have been performed prior to the RT, but I am overriding the time stamp to reflect when we assume the RT was started, as opposed to when any damage control started.
Any Sternotomy, as indicated by ICD-10 codes or by the damage control label, is excluded. From there, two options: (1) strictly use the cases where a Thoracotomy is clearly labeled as the damage control procedure and exclude all other cases from the study (maintain the purity of the data at the cost of sample size); or (2) also include all of the ICD-10 code Thoracotomy-related procedures that are not explicitly labeled a Thoracotomy by the damage control surgery label (current implementation; more samples, but possibly lower purity).
The reason I want to know is that mixing in the ICD-10 code data makes processing more complicated (although once it's done, you don't have to think about it again if it was done correctly in the initial stages of the project, which I think I have done; but again, this route has more places to make mistakes).
Some of these cases only bring a Thoracotomy into the realm of possibility, although for these individuals one certainly can't be excluded. Some cases have times recorded in hours instead of minutes, and as a result those times were converted to minutes by me prior to filtering by time. The two sources used are the HMRRHGCTRLSURGTYPE label alone and the ICD-10 codes combined with HMRRHGCTRLSURGTYPE.
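The hours-to-minutes normalization can be sketched as below; the column names and the idea of an explicit units column are assumptions for illustration, not the NTDB's actual field layout.

```python
import pandas as pd

# Hypothetical frame: some records report procedure times in hours rather
# than minutes, so everything is normalized to minutes before filtering
# by time-to-procedure.
df = pd.DataFrame({
    "TIME":  [0.25, 10.0, 1.5],
    "UNITS": ["hours", "minutes", "hours"],
})

# Keep minute values as-is; convert hour values by multiplying by 60.
df["TIME_MIN"] = df["TIME"].where(df["UNITS"] == "minutes", df["TIME"] * 60)
```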
Although I think we would gain the extra ~1.3k cases by including the cases in which RT was likely based on ICD-10 codes alone, I think maintaining the purity of the data would be better, since small swings in mortality % dramatically change how you might interpret these figures.
From our data here, although we need more data at some of the extremes of age, we can likely say that for those over the age of 70, RT is not typically survivable.
In addition, survivors with a blunt MOI who undergo RT, versus penetrating, have substantially longer recovery roads ahead of them, as well as longer times on ventilators and longer ICU stays.
Those who do not survive, although not exclusively, typically have worse vital signs in the pre-hospital and initial presentation settings. Of note, the lines we have set in trauma as critical thresholds for vital signs seem to make clinical sense with the data presented here.
Surviving individuals did not have their RT until around 9-10 minutes after arrival, whereas the deceased had their RT within 5-6 minutes, likely indicating the severity of their injuries. The surviving group likely had more time to stabilize the patient in other ways prior to initiating RT, or the patient arrived in better condition, leading to overall better outcomes (hypothesis).
Those who survived received a larger volume of blood products in the first 4 and 24 hours compared to the deceased; however, this is likely skewed simply because the vast majority of patients do not survive the first 30 minutes after arrival and thus are never transfused these larger volumes of products.
With the exception of outliers, all of the deceased had an initial ED GCS of 3, whereas those who survived had a range of 3-15.
Comorbid conditions did not seem to yield much meaningful information, given that the trauma population is largely young and reasonably healthy, in addition to the lack of information obtainable from the patient/EMS due to patient condition or lack of records.
Alcohol and recreational drugs did not seem to provide any type of survival advantage and were typically associated with worse outcomes.
At the age extremes, I think we need a lot more data to make more definitive guidelines, but I think this is still important and informative information.
Part of fixing the above is getting the older NTDB data to work with the newer data, as this would at least double the samples available to study (some of the ancillary information is not present in the older data, so secondary analyses may be trickier).
Propose the addition of transport time and minutes of CPR prior to arrival as new data fields in the NTDB. These would make it possible to evaluate the quality of the RT decision algorithms from EAST/WTA going forward, i.e., to see whether those who do or do not meet those criteria and undergo RT have better or worse outcomes; these two data points are needed to do that well.
Request that vital signs data be put back into the NTDB instead of having to get a NEMSIS ID from the NTDB and then try to retrieve the vitals from NEMSIS (starting in 2021, EMS vitals are no longer tracked in the NTDB).
I would like to investigate outcomes based on the subprocedures performed (i.e., intracardiac epinephrine, cross-clamping the aorta, etc.) to see if there are differences in outcomes when these are implemented (are these helping or hurting, or does the mortality speak more to the severity of the patient's injuries than to procedure efficacy?).
It would be interesting to see how far back this NEMSIS database goes and whether it includes timing data on pre-hospital CPR, to then look at outcomes of patients who received RT in or out of compliance with the guidelines put forth by WTA/EAST.
Investigate and compare sternotomy to RT that is performed on one or both sides and see if there are any major differences
What are the things about RT that you are personally curious about, think should be investigated, or think are areas of potential improvement?
I would love some feedback on which statistical tests should be used to determine significance for much of this data; it seems like there are a lot of options, including those used in the Loyola paper, but I imagine it depends on how deep into the weeds you want to get.
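Two common candidates, offered only as a sketch (the counts and vitals below are made up, and the real choice of test depends on the data's distribution): a chi-squared test for categorical comparisons such as mortality by mechanism, and a Mann-Whitney U test for continuous vitals that are unlikely to be normally distributed.

```python
from scipy.stats import chi2_contingency, mannwhitneyu

# Made-up contingency table: rows = penetrating/blunt, cols = deceased/survived.
table = [[180, 20],
         [95, 5]]
chi2, p_cat, dof, expected = chi2_contingency(table)

# Made-up SBP values for survivors vs. deceased; Mann-Whitney U avoids
# assuming the vitals follow a normal distribution.
sbp_survived = [90, 110, 100, 85, 120]
sbp_deceased = [0, 0, 40, 60, 0]
stat, p_cont = mannwhitneyu(sbp_survived, sbp_deceased, alternative="two-sided")
```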
Make a machine learning algorithm that predicts whether an individual will survive an RT based on vital signs and basic information about the individual that would be available to you prior to arrival or very quickly after arrival.
Similar to the idea of a MedCalc tool (for example sepsis/ SIRS criteria)
Although less pertinent at Cook County, a smaller community hospital that doesn't regularly perform RT may need the prediction PTA to mobilize appropriate resources, prepare the team and staff for the patient's arrival, etc.
Knowing upfront before arrival that you may or may not go down the avenue of using RT could be useful knowledge to be armed with
Once you have their initial vital signs, you could potentially use this after the RT to determine whether they were expected to survive, as a form of quality improvement (if the model thinks they should have been able to tolerate the RT, were there things that could have been done better or prepared for in these cases? This is not pointing fingers or necessarily implying anything other than that more specifics about the circumstances can be captured by the model).
Think back to organic chemistry (maybe don't do that) and remember Beer's Law, in which you would create serially diluted standard solutions with known concentrations and use them with your spectrometer to fit a line for predicting the concentration of an unknown solution.
When you fit that line you did a form of regression (almost there!)
Instead of a single input feature (x), there are many: y = (a1 x b1) + (a2 x b2) + (a3 x b3) + ... + (an x bn) + c, where each an is a weight (very much like m from above) multiplied by bn, the value of a feature.
The machine part of machine learning is whatever algorithm you are using to learn relationships in your data and tune your weights (an) so that the sum of all of these paired terms plus the intercept, c, equals a prediction.
Survived/deceased? = (a1 x EMS SBP) + (a2 x EMS Pulse Rate) + (a3 x EMS Respiratory Rate) + (a4 x EMS Total GCS) + c,
where the values an are your weights and c is an intercept determined by the algorithm to fit your data.
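The formula above can be sketched with scikit-learn; the training rows below are invented toy patients, not NTDB data, and logistic regression is used here as the simplest representative of this family of models (the fitted coefficients play the role of a1..a4 and the intercept plays the role of c, passed through a logistic function to yield a probability).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy training set: columns = [EMS SBP, EMS pulse rate, EMS resp rate,
# EMS total GCS]; label 1 = deceased, 0 = survived. Values are made up.
X = np.array([
    [0,   0,   0,  3],   # arrested in the field
    [0,  30,   0,  3],
    [60, 140, 10,  6],
    [90, 110, 18, 12],   # relatively stable on arrival
    [100, 95, 16, 15],
    [85, 120, 20, 14],
])
y = np.array([1, 1, 1, 0, 0, 0])

# model.coef_ are the learned weights a1..a4; model.intercept_ is c.
model = LogisticRegression(max_iter=1000).fit(X, y)
probs = model.predict_proba([[0, 0, 0, 3]])[0]  # [P(survived), P(deceased)]
```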
If you add a penalty to the weights that dictates how big or small they are allowed to be, or how many of them will be forced to equal 0, you get algorithms called Ridge, LASSO, or ElasticNet (the last is basically a combination of Ridge and LASSO, and you tune a value to determine how much of each you want).
Lastly, there is a very well-studied, popular algorithm called XGBoost, which I have used in the app as well; it performs the best out of all of these.
v0.0 is pretty straightforward: you input all of the EMS data after selecting which algorithm you want to use, click Predict at the bottom, and it will make a prediction along with returning the Confidence, which is the probability with which the algorithm thought the outcome was the right answer.
Basically, the model assigns Deceased a value of 1 and Survived a value of 0, and returns, for each class, the probability it thinks that class is the correct answer.
For example, for a patient who sustained blunt trauma, is a 25 y/o male, and had initial EMS vital signs of SBP 0, pulse rate 0, respiratory rate 0, and GCS 3, the algorithm will return a value of 0.11 for class 0 (Survived) and 0.89 for class 1 (Deceased).
0.89 converted to a percentage is the Confidence that is output along with the predicted class. v0.0 is simple and effective but has some drawbacks. In the meantime, I am still working through the kinks of v1.0, which allows you to select whether you want to use information from the EMS setting, the ED setting, or both.
Better yet, you can input any combination of at least 2 pieces of information and get a prediction (be mindful that the less information given, the less reliable your prediction will likely be, although this may not necessarily be reflected by the Confidence; I would like to find a better metric that incorporates the number of input data fields as a parameter).
In order to do this, you have to train a different machine learning model for every single combination of algorithm, input data fields, and the other parameters the machine learning algorithms use for training (also called tuning the hyperparameters).
Yes, that's right: not only are there different models, but each model has several hyperparameters. You train the models on a subset of the data (which cannot have missing values) with each combination of hyperparameters to find the best ones, and then evaluate the winning model on one last held-out group of data that the algorithm has never seen before.
Also, if you were wondering how you prevent your algorithm from becoming biased towards simply predicting that everyone who gets an RT will be deceased (since >90% of the time that would be a good assumption): you artificially make there be an approximately equal number of deceased and survived patients in your training data, shielding your algorithm from this bias.
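One common way to do that balancing is to undersample the majority class; this sketch uses made-up outcome data, and undersampling is shown here as one representative technique (class weighting is another option the source doesn't specify either way).

```python
import pandas as pd

# Made-up outcome column, heavily imbalanced toward deceased (1), plus a
# dummy feature so the frame resembles a training set.
df = pd.DataFrame({"DECEASED": [1] * 90 + [0] * 10, "SBP": range(100)})

# Randomly drop majority-class rows so training sees equal class counts.
survived = df[df["DECEASED"] == 0]
deceased = df[df["DECEASED"] == 1].sample(n=len(survived), random_state=0)
balanced = pd.concat([survived, deceased])
```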
I also include a Dummy algorithm that does exactly that: it predicts that the outcome is death no matter what the input is, and I use it as a baseline that my models have to beat before I consider them any good (along with being better than random guessing, i.e., an AUROC > 0.5).
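This baseline can be sketched with scikit-learn's DummyClassifier (the labels below are made up to mirror a ~90% deceased split); it shows why accuracy alone is misleading here: the constant predictor scores 90% accuracy yet lands at exactly 0.5 AUROC, because it cannot rank patients at all.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import roc_auc_score

# Made-up labels: ~90% deceased (1), mirroring the RT outcome split.
y_true = np.array([1] * 9 + [0])
X = np.zeros((10, 1))  # features are irrelevant to a constant predictor

# Always-predict-death baseline.
dummy = DummyClassifier(strategy="constant", constant=1).fit(X, y_true)
accuracy = dummy.score(X, y_true)          # high, but uninformative
auroc = roc_auc_score(y_true, dummy.predict_proba(X)[:, 1])  # chance level
```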
Categorical variables (e.g., Mechanism of Injury) are handled by converting each unique category to an integer and then saving a dictionary with these mappings, which you use to translate the meaning back on the back end if you need to.
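A minimal sketch of that encode/decode round trip, using pandas' built-in factorize (the category values are made up):

```python
import pandas as pd

# Made-up Mechanism of Injury values; factorize assigns each unique
# category an integer and returns the categories in order of appearance.
moi = pd.Series(["Blunt", "Penetrating", "Blunt", "Penetrating"])
codes, categories = pd.factorize(moi)

# Reverse dictionary used to translate integers back to text on the back end.
decode = dict(enumerate(categories))
```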
It takes ~285k different models to make this application work. I used to have access to a supercomputer (MSU HPCC), basically the equivalent of 50k desktops put together that you could command at will to run massive amounts of code. Now I have my laptop (which ran for ~50.5 hours and churned out around 40k of the models, during which time I thought it might combust) and the desktop that I built a few years ago (~8X the computing power).
So this is where v2.0 comes in: my PC was needed so that my laptop wasn't running this code for 10+ days in a row. It turns out MacOS and Windows do not play nicely when sharing code between the two systems (something I've never done before and will hopefully never do again), so v2.0 is still in progress and running the models as we speak.
Additionally, there is a bug in v1.0, whose cause I have not yet determined, that predicts the same example patient presented earlier as having a 99% chance of survival (we all know that's not how that one is going to pan out, sorry kid). I need to figure that out, although by the time v2.0 finishes, this may not be necessary at all.
Last but certainly not least: currently I am paying $10 per month for a service to host the website; however, this is only viable in the short term, as the app is still basically being hosted from the Terminal on my laptop, so if my laptop is not running the app's code, the app does not exist online (problem).
In the future I will need to find a way to host this on a cloud server (something like Amazon Web Services (AWS)) so that it can be up all the time.
There's not a lot here yet as I still need to cultivate some effective ways to objectively measure how good the algorithms are
Need to be able to host the app without my laptop running around the clock like a guinea pig on a wheel
There is the potential to allow other people to submit the outcomes of RTs and create a database just for that purpose that could be used to improve the models (obviously de-identified, with all the other hoops).
There are probably better methods out there that are more complex and advanced, but there is a lot I can do to improve my current models without switching algorithms, as an alternative avenue to try first.
A concern is that this is all quite technical, and although I have some experience, having more experienced eyes on a project like this would be good; if this were to become an academic pursuit, that would likely need to be an important consideration at some point in its development.